Automatic Lexical Acquisition Based on Statistical Distributions

Authors

  • Suzanne Stevenson
  • Paola Merlo
Abstract

We automatically classify verbs into lexical semantic classes, based on distributions of indicators of verb alternations, extracted from a very large annotated corpus. We address a problem which is particularly difficult because the verb classes, although semantically different, show similar surface syntactic behavior. Five grammatical features are sufficient to reduce error rate by more than 50% over chance: we achieve almost 70% accuracy in a task whose baseline performance is 34%, and whose expert-based upper bound we calculated at 86.5%. We conclude that corpus-driven extraction of grammatical features is a promising methodology for fine-grained verb classification.
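
The "more than 50%" figure can be checked from the numbers in the abstract. The following is a minimal sketch, not the authors' code: the function name `relative_error_reduction` is hypothetical, and "almost 70%" is approximated as 0.70, which is an assumption rather than the exact reported accuracy.

```python
def relative_error_reduction(baseline_accuracy: float, system_accuracy: float) -> float:
    """Fraction of the baseline error rate that the system removes."""
    baseline_error = 1.0 - baseline_accuracy
    system_error = 1.0 - system_accuracy
    return (baseline_error - system_error) / baseline_error


# Figures taken from the abstract; 0.70 stands in for "almost 70%" (assumption).
BASELINE_ACCURACY = 0.34
SYSTEM_ACCURACY = 0.70
EXPERT_UPPER_BOUND = 0.865

system_reduction = relative_error_reduction(BASELINE_ACCURACY, SYSTEM_ACCURACY)
expert_reduction = relative_error_reduction(BASELINE_ACCURACY, EXPERT_UPPER_BOUND)

print(f"System error reduction over chance: {system_reduction:.1%}")
print(f"Expert upper-bound error reduction: {expert_reduction:.1%}")
```

Under these assumptions the baseline error is 66% and the system error is about 30%, so the relative reduction is roughly 0.36 / 0.66, or about 54.5%, consistent with the abstract's claim. The same calculation applied to the expert upper bound gives roughly 79.5%.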


Similar resources

The Effect of Interaction on Lexical Acquisition

This research showed that appropriate input and suitable contexts for interaction among students can lead to successful second language acquisition (SLA). This study, based on Swain's (2005) notion of collaborative dialogue, aimed to study whether EFL learners participating in negotiation-of-meaning-based tasks collaborate with each other and, if so, to investigate the role of this behavior in ...


Automatic Verb Classification Using Distributions of Grammatical Features

We apply machine learning techniques to classify automatically a set of verbs into lexical semantic classes, based on distributional approximations of diatheses, extracted from a very large annotated corpus. Distributions of four grammatical features are sufficient to reduce error rate by 50% over chance. We conclude that corpus data is a usable repository of verb class information, and that c...



Automatic Verb Classification Based on Statistical Distributions of Argument Structure

Automatic acquisition of lexical knowledge is critical to a wide range of natural language processing tasks. Especially important is knowledge about verbs, which are the primary source of relational information in a sentence: the predicate-argument structure that relates an action or state to its participants (i.e., who did what to whom). In this work, we report on supervised learning experimen...


The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are central to Machine Translation (MT) engines, which are developed through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...



Publication date: 2000